
Re: Site Scanning & IP grabs



>Can anyone tell me the best way of detecting whether a spider has had a
>look at your server, i.e., is there a list of common spiders?

Check your server's agent_log files (not the referer_log or
access_log files).  You'll see many different browser types
listed (typically hundreds of different versions), plus the identities
of the spiders that have visited.  If you're unsure which IDs are
browsers and which are spiders, check one or more of the browserwatch
pages, which list the different IDs you'll find in the agent_log files.
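
If you'd rather not eyeball the logs, here is a minimal Python sketch
of the tally.  It assumes each line of an agent_log holds one agent
string, and the keyword list (SPIDER_HINTS) is only my guess at
substrings that spiders tend to put in their IDs, not a real list of
spiders.  The function names are made up for illustration.  Anything
it doesn't flag still needs checking by hand against a published list.

```python
import sys
from collections import Counter

# Assumed substrings, NOT an authoritative list of spider names.
SPIDER_HINTS = ("spider", "crawler", "robot", "bot", "worm")

def tally_agents(lines):
    """Count each distinct agent string, skipping blank lines."""
    return Counter(line.strip() for line in lines if line.strip())

def likely_spiders(counts, hints=SPIDER_HINTS):
    """Keep only agents whose ID contains a hint (case-insensitive)."""
    return {agent: n for agent, n in counts.items()
            if any(h in agent.lower() for h in hints)}

if __name__ == "__main__":
    # Pass one or more agent_log files on the command line.
    counts = Counter()
    for path in sys.argv[1:]:
        with open(path, errors="replace") as f:
            counts.update(tally_agents(f))
    ranked = sorted(likely_spiders(counts).items(), key=lambda kv: -kv[1])
    for agent, n in ranked:
        print(n, agent)
```

Keyword matching will miss any spider whose ID doesn't contain one of
the hints, so treat the output as a starting point only.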

You may have to search through a week or so's worth of logs to see
most of the spiders visiting your site.  Many of the spiders that
visit will probably not check your site daily.  Also note that you
really have no way of telling whether a visiting spider is searching
for new pages or simply doing HEAD requests to check that indexed
pages are still there.